51 research outputs found

Local Statistical Modeling via Cluster-Weighted Approach with Elliptical Distributions

Author: Giorgio Vittadini
Salvatore Ingrassia
Simona Caterina Minotti

Cluster Weighted Modeling (CWM) is a mixture approach to modeling the joint probability of data coming from a heterogeneous population. Under Gaussian assumptions, we investigate statistical properties of CWM from both the theoretical and numerical points of view; in particular, we show that CWM includes as special cases mixtures of distributions and mixtures of regressions. Further, we introduce CWM based on Student-t distributions, providing more robust fitting for groups of observations with longer-than-normal tails or atypical observations. Theoretical results are illustrated using some empirical studies, considering both real and simulated data.

Keywords: Cluster-Weighted Modeling, Mixture Models, Model-Based Clustering
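As a reading aid, the joint-density decomposition that CWM is usually written with can be sketched as follows. The notation (G components, weights \pi_g, parameter symbols \xi_g, \psi_g, \beta_g) is an assumption chosen for illustration and is not taken from the abstract.

```latex
% Sketch of the cluster-weighted decomposition (notation assumed):
% each component g contributes a conditional density for the response
% and a marginal density for the covariates.
p(\mathbf{x}, y; \boldsymbol{\theta})
  = \sum_{g=1}^{G} \pi_g \,
    p(y \mid \mathbf{x}; \boldsymbol{\xi}_g) \,
    p(\mathbf{x}; \boldsymbol{\psi}_g),
\qquad \pi_g > 0, \quad \sum_{g=1}^{G} \pi_g = 1 .

% Under Gaussian assumptions with a linear conditional mean:
p(\mathbf{x}; \boldsymbol{\psi}_g)
  = \phi_d(\mathbf{x}; \boldsymbol{\mu}_g, \boldsymbol{\Sigma}_g),
\qquad
p(y \mid \mathbf{x}; \boldsymbol{\xi}_g)
  = \phi\!\left(y;\ \beta_{0g} + \boldsymbol{\beta}_g^{\top}\mathbf{x},\ \sigma_g^{2}\right).
```

Here \phi_d denotes a d-variate normal density and \phi a univariate one. The Student-t variant mentioned in the abstract would replace these normal densities with t densities.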

Research Papers in Economics

flexCWM: A Flexible Framework for Cluster-Weighted Models

Author: Angelo Mazza
Antonio Punzo
Salvatore Ingrassia
Publication venue: Foundation for Open Access Statistics
Publication date: 01/09/2018

Cluster-weighted models (CWMs) are mixtures of regression models with random covariates. However, despite having recently become rather popular in statistics and data mining, there is still a lack of support for CWMs within the most popular statistical suites. In this paper, we introduce flexCWM, an R package specifically conceived for fitting CWMs. The package supports modeling the conditioned response variable by means of the most common distributions of the exponential family and by the t distribution. Covariates are allowed to be of mixed type, and parsimonious modeling of multivariate normal covariates, based on the eigenvalue decomposition of the component covariance matrices, is supported. Furthermore, either the response or the covariate distributions can be omitted, yielding mixtures of distributions and mixtures of regression models with fixed covariates, respectively. The expectation-maximization (EM) algorithm is used to obtain maximum-likelihood estimates of the parameters, and likelihood-based information criteria are adopted to select the number of groups and/or a parsimonious model. For the component regression coefficients, standard errors and significance tests are also provided. Parallel computation can be used on multicore PCs and computer clusters when several models have to be fitted. To exemplify the use of the package, applications to artificial and real datasets, included in the package, are presented.

Directory of Open Access Journals

Journal of Statistical Software

The joint role of trimming and constraints in robust estimation for mixtures of Gaussian factor analyzers.

Author: García-Escudero Luis Ángel
Gordaliza Alfonso
Greselin Francesca
Ingrassia Salvatore
Mayo Iscar Agustín
Publication venue: ELSEVIER
Publication date: 15/12/1998

Mixtures of Gaussian factors are powerful tools for modeling an unobserved heterogeneous population, offering, at the same time, dimension reduction and model-based clustering. The high prevalence of spurious solutions and the disturbing effects of outlying observations in maximum likelihood estimation may cause biased or misleading inferences. Restrictions for the component covariances are considered in order to avoid spurious solutions, and trimming is also adopted to provide robustness against violations of normality assumptions of the underlying latent factors. A detailed AECM algorithm for this new approach is presented. Simulation results and an application to the AIS dataset show the aim and effectiveness of the proposed methodology.

Funding: Ministerio de Economía y Competitividad and FEDER, grant MTM2014-56235-C2-1-P; Consejería de Educación de la Junta de Castilla y León, grant VA212U13; grant FAR 2015 from the University of Milano-Bicocca; and grant FIR 2014 from the University of Catania.

Crossref

Repositorio Documental de la Universidad de Valladolid

Oskar Bordeaux

Robust estimation for mixtures of Gaussian factor analyzers, based on trimming and constraints

Author: García Escudero Luis Ángel
Gordaliza Ramos Alfonso
Greselin Francesca
Mayo Iscar Agustín
Salvatore Ingrassia
Publication venue: Universidad de Valladolid. Facultad de Medicina
Publication date: 01/01/2015

Mixtures of Gaussian factors are powerful tools for modeling an unobserved heterogeneous population, offering, at the same time, dimension reduction and model-based clustering. Unfortunately, the high prevalence of spurious solutions and the disturbing effects of outlying observations in maximum likelihood estimation raise serious issues. In this paper we consider restrictions for the component covariances, to avoid spurious solutions, and trimming, to provide robustness against violations of normality assumptions of the underlying latent factors. A detailed AECM algorithm for this new approach is presented. Simulation results and an application to the AIS dataset show the aim and effectiveness of the proposed methodology.
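The trimming idea named in the abstract can be illustrated with a minimal sketch. It assumes that trimming means discarding the proportion alpha of observations with the lowest density under the current fit; it is not the authors' AECM algorithm, and the function name `trimmed_indices` and the sample values are hypothetical.

```python
import math


def trimmed_indices(log_densities, alpha):
    """Return the sorted indices of observations kept after trimming.

    log_densities: per-observation log density under the current mixture fit
                   (assumed to be computed elsewhere).
    alpha: trimming proportion in [0, 1).
    """
    n = len(log_densities)
    n_trim = math.floor(n * alpha)  # number of observations to discard
    # Order indices from lowest to highest density; the first n_trim are trimmed.
    order = sorted(range(n), key=lambda i: log_densities[i])
    return sorted(order[n_trim:])


# Hypothetical log densities: observations 2 and 4 fit the model poorly.
log_dens = [-1.2, -0.8, -9.5, -1.0, -7.3, -0.9, -1.1, -1.3, -0.7, -1.4]
kept = trimmed_indices(log_dens, alpha=0.2)
print(kept)
```

In an iterative fit, a step like this would typically be repeated at each iteration so that parameter updates use only the retained observations. The covariance constraints described in the abstract are a separate ingredient not shown here.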

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Documental de la Universidad de Valladolid

Robust estimation of mixtures of regressions with random covariates, via trimming and constraints

Author: García Escudero Luis Ángel
Gordaliza Ramos Alfonso
Greselin Francesca
Ingrassia Salvatore
Mayo Iscar Agustín
Publication venue: Universidad de Valladolid. Facultad de Medicina
Publication date: 01/01/2015

A robust estimator for a wide family of mixtures of linear regressions is presented. Robustness is based on the joint adoption of the Cluster Weighted Model and of an estimator based on trimming and restrictions. The selected model provides the conditional distribution of the response for each group, as in mixtures of regressions, and further supplies local distributions for the explanatory variables. A novel version of the restrictions has been devised, under this model, for separately controlling the two sources of variability identified in it. This proposal avoids singularities in the log-likelihood, caused by approximate local collinearity in the explanatory variables or local exact fits in regressions, and reduces the occurrence of spurious local maximizers. In a natural way, due to the interaction between the model and the estimator, the procedure is able to resist the harmful influence of bad leverage points during the estimation of the mixture of regressions, which is still an open issue in the literature. The given methodology defines a well-posed statistical problem, whose estimator exists and is consistent for the corresponding solution of the population optimum, under widely general conditions. A feasible EM algorithm has also been provided to obtain the corresponding estimation. Many simulated examples and two real datasets have been chosen to show the ability of the procedure, on the one hand, to detect anomalous data, and, on the other hand, to identify the real cluster regressions without the influence of contamination.

Keywords: Cluster Weighted Modeling · Mixture of Regressions · Robustness

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Documental de la Universidad de Valladolid

The joint role of trimming and constraints in robust estimation for mixtures of Gaussian factor analyzers.

Author: García Escudero Luis Ángel
Gordaliza Ramos Alfonso
Greselin Francesca
Ingrassia Salvatore
Mayo Iscar Agustín
Publication venue: Elsevier BV
Publication date: 01/01/2016

Repositorio Documental de la Universidad de Valladolid